CHAPTER II

Recurrent Neural Networks 

In this chapter, the dynamics of continuous-space recurrent neural networks are first
examined in a general framework. Then the Hopfield network is introduced as a special
case of this kind of network.

2.1. Dynamical Systems 

The dynamics of a large class of neural network models may be represented by a set of
first-order differential equations of the form

$$ \frac{d}{dt} x_j(t) = F_j\big(x_1(t), x_2(t), \ldots, x_j(t), \ldots, x_N(t)\big), \qquad j = 1, \ldots, N \tag{2.1.1} $$

where F_j is a nonlinear function of its arguments.

In a more compact form it may be reformulated as

$$ \frac{d}{dt} \mathbf{x}(t) = \mathbf{F}(\mathbf{x}(t)) \tag{2.1.2} $$

where the nonlinear function F operates on the elements of the state vector x(t) in an
autonomous way, that is, F(x(t)) does not depend explicitly on time t. F(x) is a vector
field in an N-dimensional state space. Such an equation is called a state space equation,
and x(t) is called the state of the system at a particular time t.

EE543 LECTURE NOTES . METU EEE . ANKARA
Ugur HALICI
ARTIFICIAL NEURAL NETWORKS
CHAPTER 2

For the state space equation (2.1.2) to have a solution, and for that solution to be unique,
we have to impose certain restrictions on the vector function F(x(t)). For a solution to
exist, it is sufficient that F(x) is continuous in all of its arguments. However, this
restriction by itself does not guarantee the uniqueness of the solution, so we have to
impose a further restriction, known as the Lipschitz condition.

Let ||x|| denote a norm, which may be the Euclidean length, the Hamming distance or any
other one, depending on the purpose.

Let x and y be a pair of vectors in an open set h in the vector space. Then, according to
the Lipschitz condition, there exists a constant k such that

$$ \| \mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{y}) \| \le k \, \| \mathbf{x} - \mathbf{y} \| \tag{2.1.3} $$

for all x and y in h. A vector function F(x) that satisfies Eq. (2.1.3) is said to be Lipschitz.
Note that Eq. (2.1.3) also implies continuity of the function with respect to x. Therefore,
in the case of autonomous systems, the Lipschitz condition guarantees both the existence
and the uniqueness of solutions of the state space equation (2.1.2). In particular, if all
partial derivatives ∂F_i(x)/∂x_j are finite everywhere, then the function F(x) satisfies the
Lipschitz condition [Haykin 94].

Exercise: Compare the definitions of Euclidean length and Hamming distance.


2.2. Phase Space 


Regardless of the exact form of the nonlinear function F, the state vector x(t) varies with
time, that is, the point representing x(t) in the N-dimensional space changes its position
in time. While the behavior of x(t) may be thought of as a flow, the vector function F(x)
may be thought of as a velocity vector in an abstract sense.

To visualize the motion of the states in time, it may be helpful to use the phase space
of the dynamical system, which describes the global characteristics of the motion rather
than the detailed aspects of analytic or numeric solutions of the equation.

At a particular instant of time t, a single point in the N-dimensional phase space
represents the observed state of the system, that is x(t). Changes in the state of the
system with time t are represented as a curve in the phase space, each point on the curve
carrying (explicitly or implicitly) a label that records the time of observation. This curve
is called a trajectory or orbit of the system. Figure 2.1.a illustrates a trajectory in a
two-dimensional system.

The family of trajectories, each corresponding to a different initial condition x(0), is
called the phase portrait of the system (Figure 2.1.b). The phase portrait includes all those
points in the phase space where the field vector F(x) is defined. For an autonomous
system, there is one and only one trajectory passing through a given initial state
[Abraham and Shaw 92, Haykin 94]. The tangent vector, that is dx(t)/dt, represents the
instantaneous velocity F(x(t)) of the trajectory. We may thus derive a velocity vector for
each point of the trajectory.

Figure 2.1. a) A two dimensional trajectory b) Phase portrait




2.3. Major forms of Dynamical Systems 

For fixed weights and inputs, we distinguish three major forms of dynamical systems
[Bressloff and Weir 91]. Each is characterized by the behavior of the network when t is
large, so that any transients are assumed to have disappeared and the system has settled
into some steady state (Figure 2.2):



Figure 2.2. Three major forms of dynamical systems 
a) Convergent b) Oscillatory c) Chaotic 


a) Convergent: every trajectory x(t) converges to some fixed point, which is a state that
does not change over time (Figure 2.2.a). These fixed points are called the attractors of
the system. The set of initial states x(0) that evolve to a particular attractor is called its
basin of attraction. The locations of the attractors and the basin boundaries change as the
dynamical system parameters change. For example, by altering the external inputs or
connection weights in a recurrent neural network, the basins of attraction of the system
can be adjusted.

b) Oscillatory: every trajectory converges either to a cycle or to a fixed point. A cycle of
period T satisfies x(t+T)=x(t) for all times t (Figure 2.2.b).

c) Chaotic: most trajectories do not tend to cycles or fixed points. One of the
characteristics of chaotic systems is that the long-term behavior of trajectories is
extremely sensitive to initial conditions. That is, a slight change in the initial state x(0)
can lead to very different behaviors as t becomes large.


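2.4. Gradient, Conservative and Dissipative Systems

For a vector field F(x) on the state space x(t) ∈ R^N, the ∇ operator helps in the formal
description of the system. In fact, ∇ is an operational vector defined as:

$$ \nabla = \left[ \frac{\partial}{\partial x_1} \;\; \frac{\partial}{\partial x_2} \;\; \cdots \;\; \frac{\partial}{\partial x_N} \right] \tag{2.4.1} $$

When the ∇ operator is applied to a scalar function E of the vector x(t), the result

$$ \nabla E = \left[ \frac{\partial E}{\partial x_1} \;\; \frac{\partial E}{\partial x_2} \;\; \cdots \;\; \frac{\partial E}{\partial x_N} \right] \tag{2.4.2} $$

is called the gradient of the function E. It points in the direction of the greatest rate of
change of E, and its length equals that rate of change.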

If we set E(x)=c, we obtain a family of surfaces known as the level surfaces of E, as c takes
on different values. Due to the assumption that E is single valued at each point, one and
only one level surface passes through any given point P [Wylie and Barret 85]. The
gradient of E(x) at any point P is perpendicular to the level surface of E that passes
through that point (Figure 2.3).

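For a vector field

$$ \mathbf{F}(\mathbf{x}) = [F_1(\mathbf{x}) \;\; F_2(\mathbf{x}) \;\; \cdots \;\; F_N(\mathbf{x})]^T \tag{2.4.3} $$

the quantity

$$ \nabla \cdot \mathbf{F} = \frac{\partial F_1}{\partial x_1} + \frac{\partial F_2}{\partial x_2} + \cdots + \frac{\partial F_N}{\partial x_N} \tag{2.4.4} $$

is called the divergence of F, and it has a scalar value.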

Consider a region of volume V and surface S in the phase space of an autonomous
system, and assume a flow of points from this region. From our earlier discussion, we
recognize that the velocity vector dx/dt is equal to the vector field F(x). Provided that the
vector field F(x) within the volume V is "well behaved", we may apply the divergence
theorem from vector calculus [Wylie and Barret 85, Haykin 94]. Let n denote a unit
vector normal to the surface at dS, pointing outward from the enclosed volume. Then,
according to the divergence theorem, the relation

$$ \int_S \big( \mathbf{F}(\mathbf{x}) \cdot \mathbf{n} \big) \, dS = \int_V \big( \nabla \cdot \mathbf{F}(\mathbf{x}) \big) \, dV \tag{2.4.5} $$

holds between the volume integral of the divergence of F(x) and the surface integral of
the outwardly directed normal component of F(x). The quantity on the left-hand side of
Eq. (2.4.5) is recognized as the net flux flowing out of the region surrounded by the
closed surface S. If this quantity is zero, the system is conservative; if it is negative, the
system is dissipative. In the light of Eq. (2.4.5), we may state equivalently that if the
divergence

$$ \nabla \cdot \mathbf{F}(\mathbf{x}) = 0 \tag{2.4.6} $$

then the system is conservative, and if

$$ \nabla \cdot \mathbf{F}(\mathbf{x}) < 0 \tag{2.4.7} $$

the system is dissipative, which implies the stability of the system [Haykin 94].
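
The divergence test can be tried numerically. The following is only a minimal sketch: the two oscillator vector fields, the damping coefficient c and the helper name divergence are illustrative assumptions, not part of these notes. It approximates the divergence with central differences and shows a value near zero for the undamped (conservative) case and a negative value for the damped (dissipative) case.

```python
from typing import Callable, List

Vector = List[float]


def divergence(F: Callable[[Vector], Vector], x: Vector, h: float = 1e-5) -> float:
    """Approximate div F(x) = sum_i dF_i/dx_i with central differences."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (F(forward)[i] - F(backward)[i]) / (2.0 * h)
    return total


def undamped(x: Vector) -> Vector:
    # Hypothetical example: harmonic oscillator, dx1/dt = x2, dx2/dt = -x1
    return [x[1], -x[0]]


def damped(x: Vector, c: float = 0.5) -> Vector:
    # Same oscillator with an assumed damping coefficient c
    return [x[1], -x[0] - c * x[1]]


point = [0.3, -1.2]
print(divergence(undamped, point))  # close to 0: conservative
print(divergence(damped, point))    # close to -c = -0.5: dissipative
```

For these linear fields the divergence is the same at every point, so a single evaluation is enough; for a general nonlinear F it would have to be checked over the region of interest.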
2.5. Equilibrium States

A constant vector x* satisfying the condition

$$ \mathbf{F}(\mathbf{x}^*) = \mathbf{0} \tag{2.5.1} $$

is called an equilibrium state (stationary state or fixed point) of the dynamical system
defined by Eq. (2.1.2). Since it results in

$$ \left. \frac{dx_i}{dt} \right|_{\mathbf{x}^*} = 0 \quad \text{for } i = 1, \ldots, N \tag{2.5.2} $$

the constant function x(t)=x* is a solution of the dynamical system. If the system is
operating at an equilibrium point, then the state vector stays constant, and the trajectory
with the initial state x(0)=x* degenerates to a single point.

We are frequently interested in the behavior of the system around the equilibrium points,
and investigate whether the trajectories near an equilibrium point converge to it, diverge
from it, stay in an orbit around it, or show a combination of these.


The use of a linear approximation of the nonlinear function F(x) makes it easier to
understand the behavior of the system around the equilibrium points. Let x=x*+Δx be a
point around x*. If the nonlinear function F(x) is smooth and the disturbance Δx is
small enough, then F can be approximated by the first two terms of its Taylor expansion
around x* as:


$$ \mathbf{F}(\mathbf{x}^* + \Delta\mathbf{x}) \approx \mathbf{F}(\mathbf{x}^*) + \mathbf{F}'(\mathbf{x}^*) \, \Delta\mathbf{x} \tag{2.5.3} $$

where

$$ \mathbf{F}'(\mathbf{x}^*) = \left. \frac{\partial \mathbf{F}}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{x}^*} \tag{2.5.4} $$

that is, in particular:

$$ F'_{ij}(\mathbf{x}^*) = \left. \frac{\partial F_i(\mathbf{x})}{\partial x_j} \right|_{\mathbf{x} = \mathbf{x}^*} \tag{2.5.5} $$

Notice that F(x*) and F'(x*) in Eq. (2.5.3) are constant, therefore the right-hand side is
linear in Δx.

Furthermore, since an equilibrium point satisfies Eq. (2.5.1), we obtain

$$ \mathbf{F}(\mathbf{x}^* + \Delta\mathbf{x}) \approx \mathbf{F}'(\mathbf{x}^*) \, \Delta\mathbf{x} \tag{2.5.6} $$

On the other hand, since x* is constant,

$$ \frac{d}{dt} (\mathbf{x}^* + \Delta\mathbf{x}) = \frac{d}{dt} \Delta\mathbf{x} \tag{2.5.7} $$

and Eq. (2.1.2) becomes, to first order,

$$ \frac{d}{dt} \Delta\mathbf{x} = \mathbf{F}'(\mathbf{x}^*) \, \Delta\mathbf{x} \tag{2.5.8} $$

Since Eq. (2.5.8) defines a homogeneous differential equation with constant real
coefficients, the eigenvalues of the matrix F'(x*) determine the behavior of the system.

Exercise: Give the general form of the solution of the system defined by Eq. (2.5.8).

Notice that, in order for Δx(t) to diminish as t → ∞, the real parts of all the
eigenvalues must be negative.
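
As an illustration of the eigenvalue test, the sketch below linearizes a two-dimensional system numerically around an equilibrium point. The damped pendulum used as F, its damping coefficient c, and the helper names are assumptions chosen for the example; they do not appear in these notes. The Jacobian is estimated by central differences and, for the 2x2 case, its eigenvalues follow from the trace and the determinant.

```python
import cmath
import math
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def jacobian(F: Callable[[Vector], Vector], x: Vector, h: float = 1e-6) -> Matrix:
    """Estimate J[i][j] = dF_i/dx_j at x with central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        fwd, bwd = list(x), list(x)
        fwd[j] += h
        bwd[j] -= h
        f_fwd, f_bwd = F(fwd), F(bwd)
        for i in range(n):
            J[i][j] = (f_fwd[i] - f_bwd[i]) / (2.0 * h)
    return J


def eigenvalues_2x2(J: Matrix) -> Tuple[complex, complex]:
    """Roots of lambda^2 - trace*lambda + det = 0 for a 2x2 matrix."""
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def pendulum(x: Vector, c: float = 0.5) -> Vector:
    # Hypothetical example: damped pendulum, x[0] = angle, x[1] = angular velocity
    return [x[1], -math.sin(x[0]) - c * x[1]]


def is_locally_attracting(F: Callable[[Vector], Vector], x_star: Vector) -> bool:
    """True if all eigenvalues of the linearization have negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_2x2(jacobian(F, x_star)))


print(is_locally_attracting(pendulum, [0.0, 0.0]))      # hanging position
print(is_locally_attracting(pendulum, [math.pi, 0.0]))  # inverted position
```

Both points satisfy F(x*)=0, but only the hanging position has eigenvalues with negative real parts; at the inverted position one eigenvalue is positive, so small disturbances grow.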

2.6. Stability 

An equilibrium state x* of an autonomous nonlinear dynamical system is called stable if,
for any given positive ε, there exists a positive δ satisfying

$$ \| \mathbf{x}(0) - \mathbf{x}^* \| < \delta \;\Rightarrow\; \| \mathbf{x}(t) - \mathbf{x}^* \| < \varepsilon \quad \text{for all } t > 0 \tag{2.6.1} $$

If x* is a stable equilibrium point, then any trajectory described by the state
vector x(t) of the system can be made to stay within a small neighborhood of the
equilibrium state x* by choosing an initial state x(0) close enough to x*.

An equilibrium point x* is said to be asymptotically stable if it is also convergent, where
convergence requires the existence of a positive δ such that

$$ \| \mathbf{x}(0) - \mathbf{x}^* \| < \delta \;\Rightarrow\; \lim_{t \to \infty} \mathbf{x}(t) = \mathbf{x}^* \tag{2.6.2} $$

If the equilibrium point is convergent, the trajectory can be made to approach x* as t
goes to infinity, again by choosing an initial state x(0) close enough to x*. Notice that
asymptotically stable states correspond to attractors of the system.

For an autonomous nonlinear dynamical system, the asymptotic stability of an equilibrium
state x* can be decided by the existence of energy functions. Such energy functions are
also called Liapunov functions, since they were introduced by Alexander Liapunov in
the 1890s to prove the stability of differential equations.
A continuous function L(x) with a continuous time derivative L'(x)=dL(x)/dt is a definite
Liapunov function if it satisfies:

a) L(x) is bounded

b) L'(x) is negative definite, that is:

$$ L'(\mathbf{x}) < 0 \quad \text{for } \mathbf{x} \ne \mathbf{x}^* \tag{2.6.3} $$

and

$$ L'(\mathbf{x}) = 0 \quad \text{for } \mathbf{x} = \mathbf{x}^* \tag{2.6.4} $$

If the condition (2.6.3) is in the form

$$ L'(\mathbf{x}) \le 0 \quad \text{for } \mathbf{x} \ne \mathbf{x}^* \tag{2.6.5} $$

the Liapunov function is called semidefinite.

Having defined the Liapunov function, the stability of an equilibrium point can be
decided by using the following theorem:

Liapunov's Theorem: The equilibrium state x* is stable (asymptotically stable) if there
exists a semidefinite (definite) Liapunov function in a small neighborhood of x*.

The use of Liapunov functions makes it possible to decide the stability of equilibrium
points without solving the state space equation of the system. Unfortunately, there is no
formal way to find a Liapunov function; mostly it is determined by trial and error.
If we are able to find a Liapunov function, then we can conclude that the system is
stable. However, the inability to find a Liapunov function does not imply the instability
of the system.
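
A candidate Liapunov function can be checked numerically before attempting a proof. The sketch below is an illustration under stated assumptions: the two-dimensional system F, the candidate L(x) = x1^2 + x2^2 and the grid test are all chosen for this example and are not taken from these notes. It evaluates L'(x) = ∇L · F(x) on a grid around the equilibrium x* = 0 and checks that it is negative away from the origin and zero at it.

```python
from typing import List

Vector = List[float]


def F(x: Vector) -> Vector:
    # Hypothetical system with an equilibrium at the origin
    return [-x[0] ** 3 - x[1], x[0] - x[1] ** 3]


def L(x: Vector) -> float:
    # Candidate Liapunov function (an assumption to be tested)
    return x[0] ** 2 + x[1] ** 2


def L_dot(x: Vector) -> float:
    """Time derivative of L along trajectories: grad L(x) . F(x)."""
    grad = [2.0 * x[0], 2.0 * x[1]]
    f = F(x)
    return grad[0] * f[0] + grad[1] * f[1]


def negative_definite_on_grid(radius: float = 1.0, steps: int = 21) -> bool:
    """Check L_dot < 0 off the origin and L_dot == 0 at the origin on a grid."""
    for i in range(steps):
        for j in range(steps):
            x = [-radius + 2.0 * radius * i / (steps - 1),
                 -radius + 2.0 * radius * j / (steps - 1)]
            if x == [0.0, 0.0]:
                if L_dot(x) != 0.0:
                    return False
            elif L_dot(x) >= 0.0:
                return False
    return True


print(negative_definite_on_grid())
```

For this particular choice the derivative can also be worked out by hand: L'(x) = -2(x1^4 + x2^4), which is negative everywhere except at the origin. A grid check of this kind is only evidence, not a proof, since it samples finitely many points.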
Often the convergence of neural networks is guaranteed by introducing an energy function
together with the network itself. In fact, these energy functions are Liapunov functions, so
they are non-increasing along trajectories. Therefore the dynamics of the network can be
visualized in terms of multidimensional 'energy landscapes', as given previously in
Figure 2.3. The attractors of the dynamical system are the local minima of the energy
function, surrounded by 'valleys' corresponding to the basins of attraction (Figure 2.4).



Figure 2.4. Energy landscape and basins of attraction


2.7. Effect of input and initial state on the attraction 

The convergence of a network to an attractor of the activation dynamics may be viewed
as a retrieval process in which the fixed point is interpreted as the output of the neural
network. As an example, consider the following network dynamics:

$$ \frac{d}{dt} x_i(t) = -x_i(t) + f_i\Big( \sum_j w_{ji} x_j + \theta_i \Big) \tag{2.7.1} $$

Assume that the weight matrix W is fixed and the network is specified through θ and the
initial state x(0). Both θ and x(0) are ways of introducing an input pattern into the
network, although they play distinct dynamical roles [Bressloff and Weir 91].

We then distinguish two modes of operation, depending on whether the network has fixed
x(0) but θ=u, or fixed θ but x(0)=u. In the first case, the vector u acts as the input and
the initial state is set to some constant vector for all inputs. In general, the values of the
attractors vary smoothly as the vector u is varied, hence the network provides a
continuous mapping between the input and the output spaces (Figure 2.5.a). However, this
will break down if, for example, the initial point x(0) crosses the boundary of a basin
of attraction for some input. Such a scenario can be avoided by making the network
globally convergent, which means that all the trajectories converge to a unique attractor.

In such a network, if the initial state is not set to the same fixed vector, it may give
different responses to the same input pattern on different occasions. Although such a
feature may be desirable when considering temporal sequences, it makes the network
unsuitable as a classifier.



Figure 2.5. Both the external input u and the initial value x(0) affect the final state
a) The same initial value x(0) may result in different fixed points as final value for different u
b) Different x(0) may converge to different fixed values although u is the same


In the second mode of operation, the input pattern is presented to the network through the
initial state x(0) while θ is kept fixed. The attractors of the dynamics may be used to
represent items in a memory, while the initial states are the stimuli used to recall the
stored memory items. Initial states that contain incomplete or erroneous information
may be considered as queries to the memory. The network then converges to the
complete memory item that best fits the stimulus (Figure 2.5.b). Thus, in contrast to the
first mode of operation, which ideally uses a globally convergent network, this form of
operation exploits the fact that there are many basins of attraction, so that the network
acts as a content addressable memory. However, as a result of inappropriate choices for
the weights, a complication may arise concerning the fixed points of the network where
the memory items are intended to reside. Although the intention is to have fixed points
corresponding only to the stored memory items, undesired fixed points, called spurious
states, may also appear [Bressloff and Weir 91].


2.8. Cohen-Grossberg Theorem


The idea of using energy functions to analyze the behavior of neural networks was
introduced during the first half of the 1970s independently in [Amari 72], [Grossberg
72] and [Little 74]. A general principle, known as the Cohen-Grossberg theorem, is based
on Grossberg's studies during the previous decade. As described in [Cohen and
Grossberg 83], it is used to decide the stability of a certain class of neural networks.

Theorem: Consider a neural network with N processing elements having output signals
f_i(a_i) and transfer functions of the form

$$ \frac{d}{dt} a_i = \alpha_i(a_i) \Big( \beta_i(a_i) - \sum_{j=1}^{N} w_{ji} f_j(a_j) \Big), \qquad i = 1, \ldots, N \tag{2.8.1} $$

satisfying the constraints:

1) Matrix W=[w_ij] is symmetric, that is w_ij = w_ji, and all w_ij ≥ 0

2) Function α_i(a) is continuous and α_i(a) > 0 for a > 0

3) Function f_i(a) is differentiable and f_i'(a) = d f_i(a)/da ≥ 0 for a ≥ 0

4) (β_i(a_i) - w_ii f_i(a_i)) < 0 as a_i → ∞

5) Function β_i(a) is continuous for a > 0

6) Either lim_{a→0+} β_i(a) = ∞, or lim_{a→0+} β_i(a) < ∞ but ∫_0^a ds/α_i(s) = ∞ for some a > 0.

If the network's state a(0) at time 0 is in the positive orthant of R^N (that is, a_i(0) > 0
for i = 1..N), then the network will almost certainly converge to some stable point also in
the positive orthant. Further, there will be at most a countable number of such stable points.


Here the statement that the network will "almost certainly" converge to a stable point 
means that this will happen except for certain rare choices of the weight matrix. That is, 
if weight matrix W is chosen at random among all possible W choices, then it is virtually 
certain that a bad one will never be chosen. 

In Eq. (2.8.1), which describes the dynamic behavior of the system, α_i(a_i) is the
control parameter for the convergence rate. The decay function β_i(a_i) allows us to place
the system's stable attractors in appropriate positions of the state space.

In order to prove the part related to the stability of the system, the theorem uses the
Liapunov function approach by showing that, under the given conditions, the function

$$ E = \frac{1}{2} \sum_{i} \sum_{j} w_{ji} f_i(a_i) f_j(a_j) - \sum_{i} \int_0^{a_i} \beta_i(s) f_i'(s) \, ds \tag{2.8.2} $$

is an energy function of the system. That is, E does not increase along any
trajectory that the network's state can follow.

By using condition (1), that is W is symmetric, the time derivative of the energy
function can be written as

$$ \frac{dE}{dt} = - \sum_{i} \alpha_i(a_i) f_i'(a_i) \Big( \beta_i(a_i) - \sum_{j} w_{ji} f_j(a_j) \Big)^2 \tag{2.8.3} $$

and it is non-positive for a ≠ a* whenever conditions (2) and (3) are satisfied.

Condition (4) guarantees that every equilibrium a* has a finite value, preventing the
states from approaching infinity.

The remaining conditions (the condition w_ij ≥ 0 in (1) and conditions (5) and (6)) are
needed to prove that the solution always stays in the positive orthant and satisfies
some other detailed mathematical requirements, which are not easy to show and require
some sophisticated mathematics.

While convergence to a stable point in the positive orthant is important for a model
resembling a biological neuron, we do not need such a condition for artificial neurons as
long as they converge to some stable point having a finite value. Whenever the function f
is bounded, that is |f(a)| < c for some positive constant c, no state can take an infinite
value. However, the following constraints still remain for the stability of the system:

a) Symmetry:

$$ w_{ji} = w_{ij}, \qquad i, j = 1, \ldots, N \tag{2.8.4} $$

b) Nonnegativity:

$$ \alpha_i(a) \ge 0, \qquad i = 1, \ldots, N \tag{2.8.5} $$

c) Monotonicity:

$$ f_i'(a) = \frac{d}{da} f_i(a) \ge 0 \quad \text{for } a \ge 0 \tag{2.8.6} $$

This form of the Cohen-Grossberg theorem states that if the system of nonlinear equations
satisfies the conditions on symmetry, nonnegativity and monotonicity, then the energy
function defined by Eq. (2.8.2) is a Liapunov function of the system satisfying

$$ \frac{dE}{dt} \le 0 \quad \text{for } a_i \ne a_i^* \tag{2.8.7} $$

and the global system is therefore asymptotically stable [Haykin 94].

2.9. Hopfield Network

The continuous deterministic Hopfield model, which is based on continuous variables
and responses, was proposed in [Hopfield 84] to extend the discrete model of the
processing elements [Hopfield 82] so that it resembles actual neurons more closely. In this
extended model, the neurons are modeled as amplifiers in conjunction with feedback
circuits made up of wires, resistors and capacitors, which suggests the possibility of
building these circuits using VLSI technology. The circuit diagram of the continuous
Hopfield network is given in Figure 2.6. This circuit has a neurobiological basis, as
explained below:

Figure 2.6 Hopfield Network made of electronic components


• C_i is the total input capacitance of the i-th amplifier, representing the capacitance of
the cell membrane of neuron i

• ρ_i is the input conductance of the i-th amplifier, representing the transmembrane
conductance of neuron i

• w_ji is the value of the conductance of the connection from the output of the j-th
amplifier to the input of the i-th amplifier, representing the finite conductance between
the output of neuron j and the cell body of neuron i

• a_i(t) is the voltage at the input of the i-th amplifier, representing the soma potential of
neuron i

• x_i(t) is the output voltage of the i-th amplifier, representing the short-term average of
the firing rate of neuron i

• θ_i is the current due to the external input feeding the i-th amplifier input, representing
the threshold for activation of neuron i.

The output of the amplifier, x_i, is a continuous, monotonically increasing function of the
instantaneous input a_i to the i-th amplifier. The input-output relation of the i-th amplifier
is given by

$$ f_i(a) = \tanh(\kappa_i a) \tag{2.9.1} $$

where κ_i is a constant called the gain parameter.
Notice that, since

$$ \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2.9.2} $$

the amplifier transfer function is in fact a sigmoid function

$$ f_i(a) = \frac{2}{1 + e^{-2 \kappa_i a}} - 1 \tag{2.9.3} $$


as given in equation (1.2.8) with κ = 2κ_i, but scaled and shifted so as to take values
between -1 and +1. In Figure 2.7, the transfer function is illustrated for several values of κ.
This function is differentiable at each point and always has a positive derivative. In
particular, its derivative at the origin gives the gain κ_i, that is

$$ \kappa_i = \left. \frac{d f_i}{d a} \right|_{a = 0} \tag{2.9.4} $$

Figure 2.7 Output function used in Hopfield network
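
The relation between the tanh output function and the shifted sigmoid, and the role of the gain as the slope at the origin, can be checked numerically. The sketch below is illustrative only; the function names and the sample gain values are assumptions made for the example.

```python
import math


def f_tanh(a: float, kappa: float) -> float:
    """Output function tanh(kappa * a)."""
    return math.tanh(kappa * a)


def f_shifted_sigmoid(a: float, kappa: float) -> float:
    """Sigmoid with gain 2*kappa, scaled and shifted to the range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * kappa * a)) - 1.0


def slope_at_origin(kappa: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the derivative of f_tanh at a = 0."""
    return (f_tanh(h, kappa) - f_tanh(-h, kappa)) / (2.0 * h)


for kappa in (0.5, 1.0, 5.0):
    largest_gap = max(
        abs(f_tanh(a, kappa) - f_shifted_sigmoid(a, kappa))
        for a in (-3.0, -0.7, 0.0, 0.2, 1.5)
    )
    # largest_gap is at rounding level; the slope estimate is close to kappa
    print(kappa, largest_gap, slope_at_origin(kappa))
```

The two expressions agree up to rounding error for every sample, and the estimated slope at the origin reproduces the gain used.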


The amplifiers in the Hopfield circuit correspond to the neurons. A set of nonlinear
differential equations describes the dynamics of the network. The input voltage a_i of
amplifier i is determined by the equation

$$ C_i \frac{d a_i(t)}{dt} = - \frac{1}{R_i} a_i(t) + \sum_{j} w_{ji} f_j(a_j(t)) + \theta_i \tag{2.9.5} $$

while the output voltage is

$$ x_i = f_i(a_i) \tag{2.9.6} $$

In Eq. (2.9.5), R_i is determined as 1/R_i = ρ_i + Σ_j w_ji.

The state of the network is described by an N-dimensional state vector, where N is the
number of neurons in the network. The i-th component of the state vector is given by the
output value of the i-th amplifier, taking real values between -1 and 1. The state of the
network moves in the state space in a direction determined by the above nonlinear dynamic
equation (2.9.5).


Based on the neuron characteristics given above, the Hopfield network can be represented
by a neural network as shown in Figure 2.8.

Figure 2.8 Hopfield Network made of neurons
The energy function for the continuous Hopfield model is given by the formula

$$ E = - \frac{1}{2} \sum_{i} \sum_{j} w_{ji} x_j x_i + \sum_{i} \frac{1}{R_i} \int_0^{x_i} f_i^{-1}(x) \, dx - \sum_{i} \theta_i x_i \tag{2.9.7} $$

where f_i^{-1} is the inverse of the function f_i, that is

$$ f_i^{-1}(x_i) = a_i \tag{2.9.8} $$

In particular, for the transfer function defined by equation (2.9.3), we have

$$ f_i^{-1}(x) = - \frac{1}{2 \kappa_i} \ln \frac{1 - x}{1 + x} \tag{2.9.9} $$

which is shown in Figure 2.9.

Figure 2.9 Inverse of the output function, a = f^{-1}(x), for x between -1 and +1
Exercise: What happens to the system's energy if the sigmoid function is used instead of
the tanh function?

In [Hopfield 84], it is shown that the energy function given in Eq. (2.9.7) is an
appropriate Lyapunov function for the system, ensuring that the system eventually reaches a
stable configuration if the network has symmetric connections, that is w_ij = w_ji. Such a
network always converges to a stable equilibrium state, where the outputs of the units
remain constant.

For the energy E of the Hopfield network to be a Lyapunov function, it should satisfy the
following constraints:

a) E(x) is bounded


Because the function tanh(κa) is used in the system as the output function, it limits each
state variable to the range -1 < x_i < 1. Furthermore, because the integral of the inverse of
this function is bounded if -1 < x_i < 1, the energy function given by Eq. (2.9.7) is bounded.

In order to show that the time derivative of the energy function is always less than or equal
to zero, we differentiate E with respect to time,


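$$ \frac{dE}{dt} = - \frac{1}{2} \sum_{i} \sum_{j} w_{ji} \frac{dx_j}{dt} x_i - \frac{1}{2} \sum_{i} \sum_{j} w_{ji} x_j \frac{dx_i}{dt} + \sum_{i} \frac{1}{R_i} f_i^{-1}(x_i) \frac{dx_i}{dt} - \sum_{i} \theta_i \frac{dx_i}{dt} \tag{2.9.10} $$

Since we assumed w_ji = w_ij, we have

$$ \frac{dE}{dt} = - \sum_{i} \Big( \sum_{j} w_{ji} x_j - \frac{1}{R_i} f_i^{-1}(x_i) + \theta_i \Big) \frac{dx_i}{dt} \tag{2.9.11} $$

By the use of equations (2.9.6) and (2.9.8) in (2.9.5), we obtain

$$ \sum_{j} w_{ji} x_j - \frac{1}{R_i} f_i^{-1}(x_i) + \theta_i = C_i \frac{da_i}{dt} \tag{2.9.12} $$

Therefore Eq. (2.9.11) results in

$$ \frac{dE}{dt} = - \sum_{i} C_i \frac{da_i}{dt} \frac{dx_i}{dt} \tag{2.9.13} $$

On the other hand, notice that by the use of Eq. (2.9.8) we have

$$ \frac{da_i}{dt} = \frac{d f_i^{-1}(x_i)}{dx_i} \frac{dx_i}{dt} \tag{2.9.14} $$

so that

$$ \frac{dE}{dt} = - \sum_{i} C_i \frac{d f_i^{-1}(x_i)}{dx_i} \left( \frac{dx_i}{dt} \right)^2 \tag{2.9.15} $$

Due to equation (2.9.9) we have

$$ \frac{d f_i^{-1}(x)}{dx} > 0 \tag{2.9.16} $$

for any value of x in the interval (-1, 1). So Eq. (2.9.15) implies that

$$ \frac{dE}{dt} \le 0 \tag{2.9.17} $$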
Therefore the energy function described by equation (2.9.7) is a Lyapunov function for the
Hopfield network when the connection weights are symmetrical. This means that, whatever
the initial state of the network is, it will converge to one of the equilibrium states,
depending on the basin of attraction in which the initial state lies.
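
The decrease of the energy along a trajectory can be observed in a small simulation. The sketch below is illustrative and rests on several assumptions that are not from these notes: a two-neuron network with weights w12 = w21 = -1, zero thresholds, C_i = R_i = 1, gain κ = 2, and a simple Euler integration of the circuit equation with step dt. The integral term of the energy is evaluated in closed form for the tanh output function, using the antiderivative x·atanh(x) + ½ ln(1 - x²) of atanh(x).

```python
import math
from typing import List, Tuple

KAPPA = 2.0                        # assumed gain
W = [[0.0, -1.0], [-1.0, 0.0]]     # assumed symmetric weights, W[j][i] = w_ji
THETA = [0.0, 0.0]                 # assumed thresholds
C = [1.0, 1.0]                     # assumed capacitances
R = [1.0, 1.0]                     # assumed resistances


def outputs(a: List[float]) -> List[float]:
    """Amplifier outputs x_i = tanh(kappa * a_i)."""
    return [math.tanh(KAPPA * ai) for ai in a]


def a_dot(a: List[float]) -> List[float]:
    """da_i/dt = (-a_i/R_i + sum_j w_ji x_j + theta_i) / C_i."""
    x = outputs(a)
    n = len(a)
    return [(-a[i] / R[i] + sum(W[j][i] * x[j] for j in range(n)) + THETA[i]) / C[i]
            for i in range(n)]


def energy(a: List[float]) -> float:
    """Energy with the integral of the inverse output function in closed form."""
    x = outputs(a)
    n = len(a)
    quad = -0.5 * sum(W[j][i] * x[j] * x[i] for i in range(n) for j in range(n))
    # integral_0^x atanh(s)/kappa ds = (x*atanh(x) + 0.5*ln(1 - x^2))/kappa,
    # with atanh(x_i) = kappa * a_i
    leak = sum((x[i] * KAPPA * a[i] + 0.5 * math.log(1.0 - x[i] * x[i])) / (KAPPA * R[i])
               for i in range(n))
    bias = -sum(THETA[i] * x[i] for i in range(n))
    return quad + leak + bias


def simulate(a0: List[float], dt: float = 0.01, steps: int = 2000) -> Tuple[List[float], List[float]]:
    """Euler integration; returns the final input voltages and the energy history."""
    a = list(a0)
    energies = [energy(a)]
    for _ in range(steps):
        da = a_dot(a)
        a = [a[i] + dt * da[i] for i in range(len(a))]
        energies.append(energy(a))
    return a, energies


final_a, energies = simulate([0.2, -0.1])
print(outputs(final_a))            # outputs settle with opposite signs
print(energies[0], energies[-1])   # the energy has decreased
```

With these negative weights the two stable states have outputs of opposite sign, which resembles the two-neuron situation described for Figure 2.10. The Euler step is only an approximation of the continuous dynamics, so the monotone decrease is expected for sufficiently small dt rather than for any step size.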

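Another way to show that the Hopfield network is stable is to apply the Cohen-Grossberg
theorem given in Section 2.8. For this purpose we reorganize Eq. (2.9.5) as:

$$ \frac{d a_i}{dt} = \frac{1}{C_i} \Big( \big( - \frac{1}{R_i} a_i + \theta_i \big) - \sum_{j} (-w_{ji}) f_j(a_j) \Big) \tag{2.9.18} $$

If we compare Eq. (2.9.18) with Eq. (2.8.1), we recognize that the Hopfield network is in
fact a special case of the system defined in the Cohen-Grossberg theorem, with the
correspondences

$$ \alpha_i(a_i) \leftrightarrow \frac{1}{C_i} \tag{2.9.19} $$

$$ \beta_i(a_i) \leftrightarrow - \frac{1}{R_i} a_i + \theta_i \tag{2.9.20} $$

and

$$ w_{ij} \leftrightarrow - w_{ij} \tag{2.9.21} $$

satisfying the conditions on

a) symmetry, because w_ij = w_ji implies

$$ - w_{ij} = - w_{ji} \tag{2.9.22} $$

b) nonnegativity, because

$$ \alpha_i(a_i) = \frac{1}{C_i} > 0 \tag{2.9.23} $$

c) monotonicity, because

$$ f_i'(a) = \frac{d}{da} \tanh(\kappa_i a) > 0 \tag{2.9.24} $$

Therefore, according to the Cohen-Grossberg theorem, the energy function defined as

$$ E = \frac{1}{2} \sum_{i} \sum_{j} (-w_{ij}) f_i(a_i) f_j(a_j) - \sum_{i} \int_0^{a_i} \big( - \frac{1}{R_i} a + \theta_i \big) f_i'(a) \, da \tag{2.9.25} $$

is a Lyapunov function of the Hopfield network, and the network is globally asymptotically
stable.

In fact, the energy function defined by Eq. (2.9.25) may be organized as

$$ E = - \frac{1}{2} \sum_{i} \sum_{j} w_{ij} f_i(a_i) f_j(a_j) + \sum_{i} \frac{1}{R_i} \int_0^{a_i} a f_i'(a) \, da - \sum_{i} \theta_i \int_0^{a_i} f_i'(a) \, da \tag{2.9.26} $$

With the change of variable x = f_i(a), so that a = f_i^{-1}(x) and dx = f_i'(a) da, and
noting that f_i(0) = 0 and x_i = f_i(a_i), this becomes

$$ E = - \frac{1}{2} \sum_{i} \sum_{j} w_{ij} x_i x_j + \sum_{i} \frac{1}{R_i} \int_0^{x_i} f_i^{-1}(x) \, dx - \sum_{i} \theta_i x_i \tag{2.9.29} $$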


which is the same as the energy function defined by Eq. (2.9.7). As the time derivative
of the energy function is never positive, the state of the network changes in a
direction in which the energy decreases. The behavior of a Hopfield network of two neurons
is demonstrated in Figure 2.10 [Hopfield 84]. In the figure, the ordinate and abscissa are
the outputs of the two neurons. The network has two stable states, located near the
upper left and lower right corners and marked by x in the figure.



The second term of the energy function in Eq. (2.9.7), which is

$$ \sum_{i=1}^{N} \frac{1}{R_i} \int_0^{x_i} f_i^{-1}(x) \, dx \tag{2.9.30} $$

alters the energy landscape. The value of the gain parameter determines how close the
stable points come to the hypercube corners. In the limit of very high gain, κ → ∞, this term
approaches zero and the stable points of the system lie exactly at the corners of the
Hamming hypercube, where the value of each state component is either -1 or 1. For finite
gain, the stable points move towards the interior of the hypercube. As the gain becomes
smaller, these stable points get closer to each other. When κ=0, only a single stable point
exists for the system. Therefore the choice of the gain parameter is quite important for the
success of the operation [Freeman 91].
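
The effect of the gain on the position of the stable points can be illustrated with a deliberately small example. The sketch below assumes a two-neuron network with w12 = w21 = 1 and zero thresholds whose equilibria satisfy x = tanh(κ Wᵀx); by symmetry the stable point with x1 = x2 = x then solves the scalar equation x = tanh(κx). The network, the function name and the iteration count are assumptions made for this illustration.

```python
import math


def symmetric_fixed_point(kappa: float, iterations: int = 500) -> float:
    """Largest solution of x = tanh(kappa * x), found by iterating from x = 1."""
    x = 1.0
    for _ in range(iterations):
        x = math.tanh(kappa * x)
    return x


for kappa in (0.5, 1.5, 5.0, 50.0):
    # The stable output moves from 0 toward the corner value 1 as the gain grows.
    print(kappa, symmetric_fixed_point(kappa))
```

In this toy case the nonzero stable points exist only for κ > 1 and approach the hypercube corner as κ grows, while for small gain they merge into the single point at the origin. The threshold value 1 is specific to the assumed weights.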
2.10. Discrete time representation of recurrent networks

Consider the dynamical system defined by Eq. (2.1.2). The change Δx(t) in the value of
x(t) in a small amount of time Δt can be approximated as:

$$ \Delta \mathbf{x}(t) \approx \mathbf{F}(\mathbf{x}(t)) \, \Delta t \tag{2.10.1} $$

Hence the value of x(t+Δt) in terms of this amount of change,

$$ \mathbf{x}(t + \Delta t) = \mathbf{x}(t) + \Delta \mathbf{x}(t) \tag{2.10.2} $$

becomes

$$ \mathbf{x}(t + \Delta t) \approx \mathbf{x}(t) + \mathbf{F}(\mathbf{x}(t)) \, \Delta t \tag{2.10.3} $$

Therefore, if we start with t=0 and observe the output value after each time interval Δt,
then the value of x(t) at the k-th observation may be expressed by using the value of the
previous observation as

$$ \mathbf{x}(k) = \mathbf{x}(k-1) + \mathbf{F}(\mathbf{x}(k-1)) \, \Delta t, \qquad k = 1, 2, \ldots \tag{2.10.4} $$

or equivalently,

$$ \mathbf{x}(k) = \mathbf{x}(k-1) + \eta \, \mathbf{F}(\mathbf{x}(k-1)), \qquad k = 1, 2, \ldots \tag{2.10.5} $$

where η is used instead of Δt to represent the approximation step size, and it should be
assigned a small value for a good approximation. However, depending on the properties of
the function F, the system may also be represented by other discrete time equations.
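
The update rule with step size η can be written as a short routine. The following is a sketch under stated assumptions: the test system dx/dt = -x, whose exact solution x(t) = x(0)e^{-t} is known, and the helper names are chosen for illustration and are not from these notes. It compares the discrete approximation at t = 1 for a coarse and a fine step size.

```python
import math
from typing import Callable, List

Vector = List[float]


def euler(F: Callable[[Vector], Vector], x0: Vector, eta: float, steps: int) -> Vector:
    """Iterate x(k) = x(k-1) + eta * F(x(k-1)) for the given number of steps."""
    x = list(x0)
    for _ in range(steps):
        f = F(x)
        x = [xi + eta * fi for xi, fi in zip(x, f)]
    return x


def decay(x: Vector) -> Vector:
    # Hypothetical test system dx/dt = -x with exact solution x(0) * exp(-t)
    return [-xi for xi in x]


exact = math.exp(-1.0)
coarse = euler(decay, [1.0], eta=0.1, steps=10)[0]      # reaches t = 1 in 10 steps
fine = euler(decay, [1.0], eta=0.001, steps=1000)[0]    # reaches t = 1 in 1000 steps
print(abs(coarse - exact), abs(fine - exact))           # the finer step has the smaller error
```

Both runs cover the same time span; only the step size differs, which is why the smaller η gives the closer approximation.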
For example, for the continuous time, continuous state Hopfield network described by
Eq. (2.9.5), we may use the following discrete time approximation:

$$ \mathbf{x}(k) = \mathbf{x}(k-1) + \eta \big[ -\mathbf{x}(k-1) + \tanh\big( \kappa ( W^T \mathbf{x}(k-1) + \boldsymbol{\theta} ) \big) \big], \qquad k = 1, 2, \ldots \tag{2.10.6} $$

However, in Section 4.3 we will examine a special case of the Hopfield network where the
state variables are forced to take discrete values in a binary state space. The discrete time
dynamical system representation given by Eq. (2.10.6) will then be modified further by using
sign(W^T x(k-1) + θ), where sign is a special case of the sigmoid function in which the gain
is infinite.

We have observed that the stability of continuous time dynamical systems described by
Eq. (2.1.1) is implied by the existence of a bounded energy function with a time derivative
that is always less than or equal to zero. The states of the network resulting in zero
derivatives are the equilibrium states.

Analogously, for a discrete time neural network with an excitation

$$ \mathbf{x}(k) = \mathbf{x}(k-1) + \mathbf{G}(\mathbf{x}(k-1)), \qquad k = 1, 2, \ldots \tag{2.10.7} $$

the stability of the system is implied by the existence of a bounded energy function such
that the difference in the value of the energy is always negative as the system changes
states.

When G(x)=ηF(x), the stable states of the continuous and discrete time systems described
by equations (2.1.1) and (2.10.7), respectively, are almost the same for small values of η.
However, if η is not small enough, some stable states of the continuous system may
become unreachable in discrete time.
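
The influence of the step size can be seen on a small Hopfield-type example. The sketch below uses the discrete update x(k) = x(k-1) + η[-x(k-1) + tanh(κ(Wᵀx(k-1) + θ))] with an assumed two-neuron network (w12 = w21 = -1, zero thresholds, κ = 2) and an assumed initial state; none of these values come from these notes. With a small η the state settles at a fixed point, while with η = 1 the same initial state ends up flipping sign at every step and never reaches it.

```python
import math
from typing import List

KAPPA = 2.0                        # assumed gain
W = [[0.0, -1.0], [-1.0, 0.0]]     # assumed symmetric weights, W[j][i] = w_ji
THETA = [0.0, 0.0]                 # assumed thresholds


def step(x: List[float], eta: float) -> List[float]:
    """One discrete update x + eta * (-x + tanh(kappa * (W^T x + theta)))."""
    n = len(x)
    new_x = []
    for i in range(n):
        net = sum(W[j][i] * x[j] for j in range(n)) + THETA[i]  # (W^T x + theta)_i
        new_x.append(x[i] + eta * (-x[i] + math.tanh(KAPPA * net)))
    return new_x


def run(x0: List[float], eta: float, steps: int) -> List[float]:
    x = list(x0)
    for _ in range(steps):
        x = step(x, eta)
    return x


small = run([0.5, 0.4], eta=0.05, steps=2000)
large = run([0.5, 0.4], eta=1.0, steps=200)
print(small)                     # a fixed point with outputs of opposite sign
print(large, step(large, 1.0))   # both outputs flip sign together at every step
```

With η = 1 the update reduces to x(k) = tanh(κ Wᵀx(k-1)), and in this example each component then alternates in sign, so the state oscillates between two points instead of settling in the stable state that the small-step run reaches.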